Stylometric Identification in Electronic Markets: Scalability and Robustness

Authors

  • Ahmed Abbasi
  • Hong-Mei Chen
  • Jay F. Nunamaker
Abstract

Online reputation systems are intended to facilitate the propagation of word of mouth as a credibility scoring mechanism for improved trust in electronic marketplaces. However, they experience two problems attributable to anonymity abuse: easy identity changes and reputation manipulation. In this study, we propose the use of stylometric analysis to help identify online traders based on the writing style traces inherent in their posted feedback comments. We incorporated a rich stylistic feature set and developed the Writeprint technique for detection of anonymous trader identities. The technique and extended feature set were evaluated on a test bed encompassing thousands of feedback comments posted by 200 eBay traders. Experiments conducted to assess the scalability (number of traders) and robustness (against intentional obfuscation) of the proposed approach found it to significantly outperform benchmark stylometric techniques. The results indicate that the proposed method may help militate against easy identity changes and reputation manipulation in electronic markets.

Key words and phrases: anti-aliasing, electronic markets, online trust, similarity detection, stylometry.

Electronic markets have seen unprecedented growth in recent years. Online auction marketplaces such as eBay are one type of electronic market that has become especially popular. However, the lack of physical contact and prior interaction makes such places more susceptible to opportunistic member behavior [40]. While reputation systems attempt to alleviate some of the troubles with electronic markets, these systems suffer from two problems: easy identity changes and reputation manipulation. Easy identity changes stem from the fact that online traders can create new identities, thereby refreshing their reputation [10].
Reputation manipulation allows online market traders to inflate their reputations using multiple identities or to sabotage competitors’ reputation scores. Consequently, fraud and deception are highly prevalent in electronic markets, particularly in online auctions, which account for 50 percent of Internet fraud [9].

The aforementioned problems stem from online anonymity. However, individuals leave behind textual traces of their identity in the feedback comments posted to other traders. Stylometric similarity detection techniques applied to reputation system feedback comments can help minimize problems stemming from anonymity abuses in reputation systems. These techniques attempt to assess the degree of similarity between individuals based on writing style. Since text traces are often the only identity cues left behind in cyberspace, researchers have begun to use online stylometric analysis techniques as a forensic tool. They have recently been applied to e-mail, Web forums, and program code [11, 19, 49], as well as group support system comments [21, 22].

Despite significant progress, online stylometry has several current limitations. Most previous work focused on the identification task (where potential authorship identities are known in advance). There has been limited evaluation of similarity detection techniques, where no identities are known a priori and identities are instead clustered based on their similarity scores. Similarity detection is more practical for cyberspace applications, such as reputation systems. Furthermore, there has been a lack of evaluation of the scalability of stylometric analysis in terms of number of authors and identities per author for reputation systems. Moreover, there has been a lack of assessment of robustness against intentional stylistic alteration and message copycatting or forging.
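To make the similarity detection task concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes a tiny hypothetical feature list (`FEATURES`), relative-frequency style vectors, cosine similarity, and an arbitrary threshold of 0.9; the function names are illustrative only. No identity labels are supplied in advance; identities are simply paired when their style vectors are close.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny feature list (function words and punctuation).
# A real stylometric feature set would be far larger.
FEATURES = ["the", "and", "a", "to", "very", "!", ",", "+"]

def style_vector(comments):
    """Relative frequency of each feature across one identity's comments."""
    text = " ".join(comments).lower()
    tokens = re.findall(r"[a-z']+|[!,+]", text)
    counts = Counter(tokens)
    total = max(len(tokens), 1)
    return [counts[f] / total for f in FEATURES]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def similar_pairs(identities, threshold=0.9):
    """Return pairs of identity names whose style vectors meet the threshold.

    `identities` maps an identity name to a list of its feedback comments.
    The threshold is an assumption chosen for illustration.
    """
    vectors = {name: style_vector(c) for name, c in identities.items()}
    return [(a, b) for a, b in combinations(sorted(vectors), 2)
            if cosine(vectors[a], vectors[b]) >= threshold]
```

Calling `similar_pairs` on a dictionary of identity names and their comments yields candidate alias pairs, which could then be reviewed or fed to a clustering step.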
In this study, we propose a system that can provide stylometric analysis scalability and robustness for identifying traders in online reputation systems based on their feedback comments posted for others. The proposed system is highly accurate at differentiating across hundreds of identities based on stylistic tendencies inherent in feedback comments, and is also fairly robust against intentional stylistic alteration. The system uses an extended feature set consisting of several static and dynamic feature categories and also includes the Writeprint technique, which assesses the degree of stylistic similarity and dissimilarity between authors. Writeprint uses Karhunen–Loeve transforms to assess the degree of similarity between traders and a pattern disruption mechanism to determine stylistic dissimilarity. The system can be used for similarity detection in reputation systems to alleviate the identity change and rank manipulation problems.
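As a rough illustration of the Karhunen–Loeve transform mentioned above, the sketch below projects feature vectors onto the leading eigenvector of their covariance matrix. It is not the Writeprint technique itself, and it omits the pattern disruption mechanism entirely. It assumes only one component is kept and uses power iteration (a substitute chosen here to avoid external libraries); all function names are hypothetical.

```python
import math

def mean_center(rows):
    """Subtract the per-feature mean; returns (centered rows, means)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows], means

def covariance(centered):
    """Sample covariance matrix of mean-centered rows (needs at least 2 rows)."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def leading_component(cov, iterations=200):
    """Approximate the dominant eigenvector of `cov` by power iteration.

    Assumption: the start vector is not orthogonal to the dominant
    eigenvector; if it is, the result is not meaningful.
    """
    d = len(cov)
    v = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v

def project(row, means, component):
    """Coordinate of one feature vector along the leading component."""
    return sum((x - m) * c for x, m, c in zip(row, means, component))
```

Feature vectors from two identities can be projected with `project` and compared in the reduced space; vectors that land close together suggest similar writing style under these assumptions.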

Similar resources

A Lawful Framework For Distributed Electronic Markets

While decentralized peer-to-peer market platforms are more suited for trading short-lived or non-material goods (e.g., electrical power, bandwidth-ondemand) due to reduced transaction cost, robustness and scalability, they lack the legal certainty provided by centralized electronic market places operated by a trusted third party. This paper presents a technical framework that, conforming to Eur...

Applying Stylometric Analysis Techniques to Counter Anonymity in Cyberspace

Due to the ubiquitous nature and anonymity abuses in cyberspace, it’s difficult to make criminal identity tracing in cybercrime investigation. Writeprint identification offers a valuable tool to counter anonymity by applying stylometric analysis technique to help identify individuals based on textual traces. In this study, a framework for online writeprint identification is proposed. Variable l...

A Framework for Stylometric Similarity Detection in Online Settings

Online marketplaces and communication media such as email, web sites, forums, and chat rooms have been ubiquitously integrated into our everyday lives. Unfortunately, the anonymous nature of these channels makes them an ideal avenue for online fraud, hackers, and cybercrime. Anonymity and the sheer volume of online content make cyber identity tracing an essential yet strenuous endeavor for Inte...

An agent-based approach to flexible commerce in intermediary-centric electronic markets

The growth of electronic-markets (e-markets) necessitates the processing of many commerce protocols (processes). Each protocol handles different messages and flows of transactions among customers, merchants and intermediaries. Developing applications for each commerce protocol is costly and impractical. Accordingly, developing a flexible model of Internet commerce to support various commerce pr...

A DSS-Based Dynamic Programming for Finding Optimal Markets Using Neural Networks and Pricing

One of the substantial challenges in marketing efforts is determining optimal markets, specifically in market segmentation. The problem is more controversial in electronic commerce and electronic marketing. Consumer behaviour is influenced by different factors and thus varies in different time periods. These dynamic impacts lead to the uncertain behaviour of consumers and therefore harden the t...

Journal title:
  • J. of Management Information Systems

Volume 25

Publication date 2008